26 research outputs found

    Modeling multiple time units delayed gene regulatory network using dynamic Bayesian network.

    Effective Proxy for Human Labeling: Ensemble Disagreement Scores in Large Language Models for Industrial NLP

    Large language models (LLMs) have demonstrated significant capability to generalize across a large number of NLP tasks. For industry applications, it is imperative to assess the performance of the LLM on unlabeled production data from time to time to validate for a real-world setting. Human labeling to assess model error requires considerable expense and time delay. Here we demonstrate that ensemble disagreement scores work well as a proxy for human labeling for language models in zero-shot, few-shot, and fine-tuned settings, per our evaluation on a keyphrase extraction (KPE) task. We measure fidelity of the results by comparing to true error measured from human-labeled ground truth. We contrast with the alternative of using another LLM as a source of machine labels, or silver labels. Results across various languages and domains show disagreement scores provide a better estimation of model performance with mean average error (MAE) as low as 0.4% and on average 13.8% better than using silver labels.

    Early classification on temporal sequences

    Early classification of temporal sequences has applications in, for example, health informatics, intrusion detection, anomaly detection, and scientific and engineering sequence data monitoring. Compared with learning conventional sequence classifiers, learning early classifiers is a more challenging task and has not been systematically studied before. In this work, we identify the problem of early classification and develop a series of classifiers for temporal sequence early classification. The proposed classifiers are designed for different types of temporal sequences including symbolic sequences and time series. Furthermore, the proposed classifiers have several desirable characteristics which are useful in different application scenarios. We evaluate our approaches on a broad range of real data sets and demonstrate that the classifiers can achieve competitive classification accuracies with great earliness. Also, the classifiers can extract interpretable features from sequences for better understanding.

    Consensus Formation Control and Obstacle Avoidance of Multiagent Systems with Directed Topology

    This study addresses the problems of formation control and obstacle avoidance for a class of second-order multiagent systems with directed topology. Formation and velocity control laws are designed to solve the formation tracking problem. A new obstacle avoidance control law is also proposed. The consensus control protocol then consists of the formation, velocity, and obstacle avoidance control laws. The convergence of the proposed control protocol is analyzed by a redesigned Lyapunov function. Finally, the effectiveness of theoretical results is illustrated by simulation examples. The simulation results show that the formation tracking problem of the given multiagent systems can be realized and obstacles can be avoided under the proposed control protocol.

    Mining sequence classifiers for early prediction

    Supervised learning on sequence data, also known as sequence classification, has been well recognized as an important data mining task with many significant applications. Since temporal order is important in sequence data, in many critical applications of sequence classification such as medical diagnosis and disaster prediction, early prediction is a highly desirable feature of sequence classifiers. In early prediction, a sequence classifier should use a prefix of a sequence as short as possible to make a reasonably accurate prediction. To the best of our knowledge, early prediction on sequence data has not been studied systematically. In this paper, we identify the novel problem of mining sequence classifiers for early prediction. We analyze the problem and the challenges. As the first attempt to tackle the problem, we propose two interesting methods. The sequential classification rule (SCR) method mines a set of sequential classification rules as a classifier. A so-called early-prediction utility is defined and used to select features and rules. The generalized sequential decision tree (GSDT) method adopts a divide-and-conquer strategy to generate a classification model. We conduct an extensive empirical evaluation on several real data sets. Interestingly, our two methods achieve accuracy comparable to that of the state-of-the-art methods, but typically need to use only very short prefixes of the sequences. The results clearly indicate that early prediction is highly feasible and effective.